1. Field of the Invention
Systems and methods consistent with the principles of the invention relate generally to information retrieval and, more particularly, to providing information that is relevant to a particular document.
2. Description of Related Art
Modern computer networks, and in particular, the Internet, have made large bodies of information widely and easily available. Free Internet search engines, for instance, index many millions of web documents that are linked to the Internet. A user connected to the Internet can enter a simple search query to quickly locate web documents relevant to the search query.
One category of content that is not widely available on the Internet, however, is the more traditional printed works of authorship, such as books and magazines. One impediment to making such works digitally available is the difficulty of converting printed versions of the works to digital form. Optical character recognition (OCR), in which an optical scanning device generates images of text that are then converted to characters in a computer-readable format (e.g., an ASCII file), is a known technique for converting printed text to a useful digital form. OCR systems generally include an optical scanner for generating images of printed pages and software for analyzing the images.